A Delayed Syntactic-Encoding-based LFG Parsing Strategy for an Indian Language - Bangla
Abstract
In this squib, we propose a technique aimed at efficient computer implementation of LFG-based parsers for Indian languages in general and Bangla (Bengali) in particular. (For the LFG formalism, see Kaplan and Bresnan [1982].) The technique may also be useful for other languages having similar properties. Indian languages are mostly nonconfigurational and highly inflectional. Grammatical functions (GFs) are predicted by case inflections (markers) on the head nouns of noun phrases (NPs) and by postpositional particles in postpositional phrases (PPs). However, in many cases the mapping from case marker to GF is not one-to-one. The classical technique for nonconfigurational syntactic encoding of GFs (Bresnan 1982b) therefore requires a number of alternations to be introduced to handle this phenomenon. The resulting nondeterminism in the parser implementation leads to an inefficient unification component. The problem here, however, is not one of unbounded functional uncertainty (described, with proposed solutions, in Kaplan, Maxwell, and Zaenen [1987], Kaplan and Maxwell [1988], and Kaplan and Zaenen [1990]), but rather one of disjunctive constraint satisfaction bounded within the matrix. Disjunctive constraint satisfaction degrades the efficiency of the unification component of LFG, as has been pointed out in Knight (1989) and Maxwell and Kaplan (1991). A closer look at the languages reveals that most disjunctions do not arise if a priori knowledge of the verb is available. (The verb is generally at the end of the sentence, since Indian languages are mostly verb final, so it is the last lexeme encountered in a left-to-right scan by the parser.) Here we propose a technique that uses this fact to reduce alternations in syntactic encoding. Our method is based on delayed evaluation of the syntactic encoding schemata.
We treat the points of syntactic encoding of noun phrases as forward references that are temporarily maintained in a symbol table for later binding. Salient features of our technique include a new metavariable, an augmented scope for the Locate operator, and a special type of schema (called m-structure) projected by the verb.
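The delayed-binding idea can be illustrated with a toy sketch. It is not the authors' implementation: it models neither the new metavariable, the Locate operator, nor m-structures, and the case-marker table, verb frames, and function names below are all hypothetical placeholders chosen for illustration. It only shows how holding NPs in a symbol table until the verb is read can prune the alternatives that eager encoding would carry.

```python
# Toy sketch of delayed GF binding for a verb-final sentence.
# ASSUMPTION: the marker-to-GF table and verb frames are illustrative
# placeholders, not data from the paper.
CASE_TO_GFS = {
    "unmarked": {"SUBJ", "OBJ"},  # one marker, several candidate GFs
    "ke": {"OBJ", "OBJ2"},
}

VERB_FRAMES = {
    "see": ("SUBJ", "OBJ"),
    "give": ("SUBJ", "OBJ", "OBJ2"),
}


def eager_alternatives(nps):
    """Count the GF combinations an eager encoder would have to carry."""
    count = 1
    for _head, marker in nps:
        count *= len(CASE_TO_GFS[marker])
    return count


def bind_at_verb(nps, verb):
    """Store NPs as forward references, then bind them once the verb is known."""
    # Symbol table: each NP keeps its candidate GFs, unresolved.
    symbol_table = [(head, CASE_TO_GFS[marker]) for head, marker in nps]
    frame = VERB_FRAMES[verb]
    solutions = []

    def bind(i, used, acc):
        if i == len(symbol_table):
            solutions.append(dict(acc))
            return
        head, candidates = symbol_table[i]
        for gf in sorted(candidates):
            # Keep only GFs the verb licenses and that are still free.
            if gf in frame and gf not in used:
                bind(i + 1, used | {gf}, acc + [(gf, head)])

    bind(0, frozenset(), [])
    return solutions


nps = [("boy", "unmarked"), ("girl", "ke")]
print(eager_alternatives(nps))   # combinations before the verb is seen
print(bind_at_verb(nps, "see"))  # bindings once the verb's frame is known
```

In this toy case, four marker-level combinations collapse to a single binding once the verb's frame is available, which is the kind of reduction in disjunction the abstract describes.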
Similar resources
Parsing Bangla using LFG: An Introduction
This paper introduces the LFG (Lexical Functional Grammar) formalism for parsing Bangla. The LFG formalism, which has evolved from extensive computational, linguistic, and psycholinguistic research, provides a simple set of devices for describing the common properties of all natural languages and the particular properties of individual languages. This paper tabulates a set of instructions for us...
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of sentences, creating a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...
Parsing Indian Languages with MaltParser
This paper describes the application of MaltParser, a transition-based dependency parser, to three Indian languages – Bangla, Hindi and Telugu – in the context of the NLP Tools Contest at ICON 2009. In the final evaluation, MaltParser was ranked second among the participating systems and achieved an unlabeled attachment score close to 90% for Bangla and Hindi, and over 85% for Telugu, while the...
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatically identifying words that bear semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching the correct semantic roles to them may lead to improvement in many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Projecting LFG F-Structures from Chunks
In this paper we pursue two related goals: First, we establish a conceptual link between chunk-based syntactic structures as typically assumed in shallow parsing approaches, as opposed to principle-based syntactic structures as assumed in theoretical linguistics research. This conceptual link emerges from the study of configurational vs. non-configurational languages, their analysis within the L...
Journal title: Computational Linguistics
Volume 23
Publication date: 1997